Simple Semi-supervised Dependency Parsing

Authors

  • Terry Koo
  • Xavier Carreras
  • Michael Collins
Abstract

We present a simple and effective semisupervised method for training dependency parsers. We focus on the problem of lexical representation, introducing features that incorporate word clusters derived from a large unannotated corpus. We demonstrate the effectiveness of the approach in a series of dependency parsing experiments on the Penn Treebank and Prague Dependency Treebank, and we show that the cluster-based features yield substantial gains in performance across a wide range of conditions. For example, in the case of English unlabeled second-order parsing, we improve from a baseline accuracy of 92.02% to 93.16%, and in the case of Czech unlabeled second-order parsing, we improve from a baseline accuracy of 86.13% to 87.13%. In addition, we demonstrate that our method also improves performance when small amounts of training data are available, and can roughly halve the amount of supervised data required to reach a desired level of performance.
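The accuracy figures quoted in the abstract can also be read as reductions in parsing error. The sketch below works through that arithmetic using only the four numbers given above. The "relative error reduction" measure and the helper name `error_reduction` are illustrative assumptions; the abstract itself reports only the accuracies.

```python
def error_reduction(baseline_acc, improved_acc):
    """Return (absolute gain in points, relative error reduction in percent).

    Both arguments are accuracies in percent. Relative error reduction is an
    illustrative measure chosen here, not one reported in the abstract.
    """
    baseline_err = 100.0 - baseline_acc
    improved_err = 100.0 - improved_acc
    gain = improved_acc - baseline_acc
    relative = 100.0 * (baseline_err - improved_err) / baseline_err
    return gain, relative


# Unlabeled second-order parsing accuracies quoted in the abstract.
results = {
    "English": (92.02, 93.16),
    "Czech": (86.13, 87.13),
}

for language, (baseline, improved) in results.items():
    gain, relative = error_reduction(baseline, improved)
    print(f"{language}: +{gain:.2f} points, {relative:.1f}% relative error reduction")
```

Under this reading, the English gain of 1.14 points corresponds to removing about 14.3% of the baseline errors, and the Czech gain of 1.00 point to about 7.2%.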

Similar resources

Simple Semi-supervised Dependency Parsing

We present a simple and effective semisupervised method for training dependency parsers. We focus on the problem of lexical representation, introducing features that incorporate word clusters derived from a large unannotated corpus. We demonstrate the effectiveness of our approach in a series of dependency parsing experiments on the Penn Treebank, and we show that our cluster-based features yiel...

An Empirical Study of Semi-supervised Structured Conditional Models for Dependency Parsing

This paper describes an empirical study of high-performance dependency parsers based on a semi-supervised learning approach. We describe an extension of semi-supervised structured conditional models (SS-SCMs), a framework originally proposed by Suzuki and Isozaki (2008), to the dependency parsing problem. Moreover, we introduce two extensions related to dependency parsing: The first exten...

Semi-Supervised Convex Training for Dependency Parsing

We present a novel semi-supervised training algorithm for learning dependency parsers. By combining a supervised large margin loss with an unsupervised least squares loss, a discriminative, convex, semi-supervised learning algorithm can be obtained that is applicable to large-scale problems. To demonstrate the benefits of this approach, we apply the technique to learning dependency parsers from...

Learning Structured Classifiers for Statistical Dependency Parsing

In this thesis, I present three supervised and one semi-supervised machine learning approach for improving statistical natural language dependency parsing. I first introduce a generative approach that uses a strictly lexicalised parsing model where all the parameters are based on words, without using any part-of-speech (POS) tags or grammatical categories. Then I present an improved large margi...

Semi-supervised Dependency Parsing using Bilexical Contextual Features from Auto-Parsed Data

We present a semi-supervised approach to improve dependency parsing accuracy by using bilexical statistics derived from auto-parsed data. The method is based on estimating the attachment potential of head-modifier words, by taking into account not only the head and modifier words themselves, but also the words surrounding the head and the modifier. When integrating the learned statistics as fea...

Publication date: 2008